Altruistic Duality in Evolutionary Game Theory 
A game-theoretic dynamical model of social preference and enlightened self-interest is formulated. 
Existence of symmetry and duality in the game matrices with altruistic social preference is revealed. 
The model quantitatively describes the dynamical evolution of altruism in prisoner's dilemma and 
the regime change in prey-predator dynamics. 
Through the modeling of ecosystems, evolutionary game theory brings
such diverse fields as biology, ecology, economics and sociology under
the umbrella of the mathematical sciences [1, 2, 3]. One central
objective of evolutionary game theory is to understand the workings of
cooperative behavior among the individuals in an ecosystem. Since the
publication of the work by Axelrod on the prisoner's dilemma [4], it
has been generally understood that the concept of altruism holds a key
to the emergence of cooperative behavior. While egoism, defined as the
drive toward maximization of individual payoffs, is a cornerstone of
game theory, casual observation reveals that altruism is just as
universal a feature as egoism in systems consisting of like
individuals. Altruism has obtained mathematical expression in the work
of Bester and Güth [5], in which altruistic behavior is shown to become
evolutionarily stable in certain situations through the enhancement of
the fitness of a majority of individuals in a system. While their
mathematical treatment is general and elegant, it is formulated in
static and descriptive language. Bringing dynamics into the model
would give it more predictive power.

In this article, we do not try to explain the emergence of altruistic
cooperation. Rather, we intend to develop a game-theoretic model of an
ecosystem whose evolution is driven by the development of an optimal
degree of altruism. Toward this goal, the separation of two time
scales, one for the fast variation of dynamical variables and the
other for the secular variation of "environmental" coefficients,
proves to be crucial [7]. The formulation of our model in terms of
parametric game matrices reveals symmetry properties of altruistic
game theory. We demonstrate the usefulness of our approach through
numerical analyses of a game of prisoner's dilemma and a prey-predator
system. In the latter example, we point out the existence of a regime
change phenomenon caused by dynamical symmetry breaking.

We start by considering a system of $N$ identical individuals randomly
paired to repeatedly play a two-player game with $M+1$ pure
strategies. We introduce the payoff matrix $A$ and the system average
strategy vector $x$,

$$ A = \{A_{ij}\} \quad (i,j = 0,\dots,M), \qquad x = \{x_i\} \quad (i = 0,\dots,M). \tag{1} $$

All entries in $A$ and $x$ are real numbers, and the relation
$x_0 = 1 - x_1 - \dots - x_M$ is imposed. It is convenient to consider
$x$ as a column vector so that the matrix product $Ax$ is again a
column vector. We index the elements of $A$ such that a player with
the $i$-th strategy playing against another player with the $j$-th
strategy obtains the payoff $A_{ij}$. We interpret $x$ either as the
system being made up of $x_i N$ players playing the $i$-th strategy,
or alternatively as $N$ individuals adopting identical mixed
strategies in which the probability of playing the $i$-th strategy is
given by $x_i$. A player with a mixed strategy specified by a vector
$s$ in the system obtains the payoff

$$ \langle s|A|x\rangle = s^\dagger A x = s^\dagger p(A, x). \tag{2} $$
In the second equality, the payoff vector is defined as
$p(A, x) = Ax$, whose $i$-th element represents the payoff of a player
with the $i$-th pure strategy. We can average $\langle s|A|x\rangle$
over the entire system by the identification $s = x$, and obtain the
average per capita payoff of the system
$\Pi(x) = \langle x|A|x\rangle$. In spite of the use of the bra-ket
notations $\langle s|$ and $|x\rangle$, an obvious borrowing from
quantum mechanics, all entries of the vectors are real numbers
representing probabilities, and there should be no confusion about the
fact that we are dealing with classical game theory.
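
As a numerical illustration of the payoff vector and the per capita payoff just defined, a minimal Python sketch follows. The helper names `payoff_vector` and `bilinear_payoff`, as well as the 2x2 matrix and the population state, are assumptions chosen only for illustration; they are not taken from the text.

```python
def payoff_vector(A, x):
    """Return p(A, x) = A x; entry i is the payoff of pure strategy i."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def bilinear_payoff(s, A, x):
    """Return <s|A|x>, the payoff of mixed strategy s against population x."""
    return sum(s_i * p_i for s_i, p_i in zip(s, payoff_vector(A, x)))


# Hypothetical two-strategy payoff matrix and population state (x_0 = 1 - x_1).
A = [[0.0, 5.0],
     [-2.0, 1.0]]
x = [0.25, 0.75]

p = payoff_vector(A, x)          # payoffs of the two pure strategies
Pi = bilinear_payoff(x, A, x)    # per capita payoff, obtained by setting s = x

print("payoff vector:", p)
print("per capita payoff:", Pi)
```

Setting `s = x` in `bilinear_payoff` is exactly the system average described above; any other `s` gives the payoff of a single deviating player.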

As is well known, the best strategy for a game among players seeking
immediate individual payoff maximization is given by the mixed Nash
equilibrium [6] of the matrix $A$. This equilibrium, however, does not
always maximize the system average payoff. A system consisting of
players with a longer view on their payoff often has a higher average
payoff than a Hobbesian system consisting of narrowly egoistic
players. To describe such "enlightened self-interest", we follow
Bester and Güth [5] and separate the process of reproduction from that
of selection: the players switch strategies in pursuit of the
maximization of a range of perceived payoffs which are related, but
not necessarily identical, to the real payoff. The deviation of
perceived from real payoffs could represent imperfect information,
socially imposed norms, or just error and caprice. The system is then
assumed to be under a slow selection process during which the players
with inferior (real) payoff are pruned off. To formulate such
two-stage evolution in a simple manner, we define a one-parameter
family of matrices $A^\kappa$, which we call the game matrix with the
preference parameter $\kappa$. We assume that $A^\kappa$ reduces to
the original payoff matrix $A$ with $\kappa = 0$. The specific example
we study is the game matrix

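$$ A^\kappa = (1-\kappa)A + \kappa A^\dagger, \qquad \kappa \in [0,1], \tag{3} $$

where $A^\dagger$ is the transposed matrix of $A$ defined by

$$ A^\dagger = \{A^\dagger_{ij}\} \quad (i,j = 0,\dots,M); \qquad A^\dagger_{ij} = A_{ji}. \tag{4} $$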

Notice that, in all instances in this article, the superscript
$\kappa$ on the game matrix $A$ and the strategy vector $x$ signifies
the preference parameter, not exponentiation. Be prepared to see the
notations $A^0 = A$, $A^1 = A^\dagger$, etc. The meaning of
$A^\dagger$ becomes evident by considering its payoff vector
$p(A^\dagger, x) = A^\dagger x$, whose $i$-th component is the average
payoff yielded to the opponent by a player with the pure strategy $i$.
Therefore, we call $A^\dagger$ the altruistic dual matrix of the
original payoff matrix $A$. Obviously we have the relation
$\langle x|A|x\rangle = \langle x|A^\dagger|x\rangle$, which simply
means that the per capita system payoff can be calculated either from
the payoff obtained by each player or from the payoff contributed by
that player to the rest of the system. We can generalize this result
for $A^\kappa$ as $(A^\kappa)^\dagger = A^{1-\kappa}$, namely
$A^{1-\kappa}$ is the altruistic dual matrix of $A^\kappa$. The per
capita system payoff $\Pi(x)$ can be calculated from $A^\kappa$ with
any allowed value of $\kappa$:

$$ \Pi(x) = \langle x|A^\kappa|x\rangle, \qquad \kappa \in [0,1]. \tag{5} $$
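
The two properties stated above, that transposing the game matrix maps the preference parameter to one minus itself and that the per capita payoff (5) does not depend on the preference parameter, can be checked numerically. The following Python sketch assumes a hypothetical 2x2 payoff matrix and hypothetical helper names (`game_matrix`, `max_abs_difference`); it is an illustration, not a proof.

```python
def transpose(A):
    """Return the transposed matrix as a list of rows."""
    return [list(column) for column in zip(*A)]


def game_matrix(A, kappa):
    """Return A^kappa = (1 - kappa) A + kappa A^T for kappa in [0, 1]."""
    At = transpose(A)
    n = len(A)
    return [[(1 - kappa) * A[i][j] + kappa * At[i][j] for j in range(n)]
            for i in range(n)]


def bilinear_payoff(s, A, x):
    """Return <s|A|x> as an explicit double sum."""
    n = len(x)
    return sum(s[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def max_abs_difference(P, Q):
    """Largest entrywise difference between two matrices of equal shape."""
    return max(abs(u - v) for row_p, row_q in zip(P, Q)
               for u, v in zip(row_p, row_q))


# Hypothetical payoff matrix, population state and preference parameter.
A = [[0.0, 5.0],
     [-2.0, 1.0]]
x = [0.25, 0.75]
kappa = 0.25

# The transpose of A^kappa should coincide with A^(1 - kappa).
duality_gap = max_abs_difference(transpose(game_matrix(A, kappa)),
                                 game_matrix(A, 1 - kappa))

# <x|A^kappa|x> should be the same number for every kappa.
payoffs = [bilinear_payoff(x, game_matrix(A, k), x)
           for k in (0.0, 0.25, 0.5, 1.0)]

print("duality gap:", duality_gap)
print("per capita payoffs:", payoffs)
```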

Consider a system with a given preference parameter $\kappa$ which
evolves with replicator dynamics, in which a player with strategy $s$
tinkers with changes $\delta s$ to the strategy and accepts the
changes with a probability proportional to the resulting gain in the
payoff
$\langle s+\delta s|A^\kappa|x\rangle - \langle s|A^\kappa|x\rangle$.
The time development of the strategy $x^\kappa$ is described by the
$M$-dimensional Lotka-Volterra equation

$$ \frac{\dot{x}^\kappa_i}{x^\kappa_i} = \partial_{s_i}\langle s|A^\kappa|x^\kappa\rangle\big|_{s=x^\kappa} = p_i(A^\kappa,x^\kappa) - p_0(A^\kappa,x^\kappa), \tag{6} $$

with $i = 1,\dots,M$. We refer to this dynamics as an
$A^\kappa$-constrained game, or simply an $A^\kappa$-game. Typically,
after some time period, $x^\kappa$ approaches a stable fixed point
$X^\kappa$, which we call the $A^\kappa$-Nash equilibrium, and which is
obtained from the linear equation

$$ p_i(A^\kappa,X^\kappa) = p_0(A^\kappa,X^\kappa) \qquad (i = 1,\dots,M). \tag{7} $$
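
The approach to the fixed point can be followed numerically. The sketch below integrates the replicator equation (6), read as the rate of change of each component being that component times the payoff difference with strategy 0, using a plain explicit Euler scheme. The integration scheme, step size, helper names and the 2x2 matrix are all assumptions made for illustration; the text does not prescribe a numerical method.

```python
def game_matrix(A, kappa):
    """Return A^kappa = (1 - kappa) A + kappa A^T."""
    n = len(A)
    return [[(1 - kappa) * A[i][j] + kappa * A[j][i] for j in range(n)]
            for i in range(n)]


def payoff_vector(A, x):
    """Return p(A, x) = A x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]


def replicator_step(Ak, x, dt):
    """One explicit Euler step of dx_i/dt = x_i (p_i - p_0), i = 1..M."""
    p = payoff_vector(Ak, x)
    free = [x[i] + dt * x[i] * (p[i] - p[0]) for i in range(1, len(x))]
    return [1.0 - sum(free)] + free   # x_0 is fixed by normalization


def evolve(A, kappa, x_start, dt=0.01, steps=5000):
    """Integrate the A^kappa-game from x_start and return the final state."""
    Ak = game_matrix(A, kappa)
    x = list(x_start)
    for _ in range(steps):
        x = replicator_step(Ak, x, dt)
    return x


# Hypothetical two-strategy payoff matrix, equal mixing kappa = 1/2.
A = [[0.0, 5.0],
     [-2.0, 1.0]]
X = evolve(A, 0.5, [0.5, 0.5])
p_final = payoff_vector(game_matrix(A, 0.5), X)

print("final state:", X)
print("payoff difference p_1 - p_0:", p_final[1] - p_final[0])
```

At the end of the run the payoff difference vanishes to numerical precision, which is the fixed-point condition (7).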

Suppose we have an ensemble of systems with various values of the
preference parameter $\kappa$. If there is a selection process based
on the payoff $\Pi(X^\kappa)$ at work, the average preference
parameter $\kappa$ shall evolve toward $\kappa = \kappa_{\max}$ that
gives the maximum per capita system payoff $\Pi(X^{\kappa_{\max}})$.
For example, we can postulate

$$ m\,\frac{\dot{\kappa}}{\kappa} = \partial_\kappa \langle X^\kappa|A|X^\kappa\rangle, \tag{8} $$

where $m$ is a large number, $m \gg 1$, that ensures slow secular
variation of $\kappa$ in comparison to the variation of the dynamical
variable $x^\kappa$. We might alternatively consider the development
of $\kappa$ by Newtonian dynamics, in which case $-\Pi(X^\kappa)$
should be identified as the potential.

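Our task is reduced to evaluating the functional profile of
$\Pi(X^\kappa)$. Let us note the relation
$\Pi(X^\kappa) = p_i(A^\kappa, X^\kappa)$ for arbitrary $i$, which is
obtained from (5) and (7). Combining this with another equality

$$ p_k(A^\kappa, X^\kappa) = \sum_i \sum_j X^{1-\kappa}_i A^\kappa_{ij} X^\kappa_j = p_l(A^{1-\kappa}, X^{1-\kappa}), \tag{9} $$

which is valid for arbitrary $k$ and $l$ (the double sum reduces to
the left-hand side by (7) for $X^\kappa$, and to the right-hand side
by $A^\kappa_{ij} = A^{1-\kappa}_{ji}$ and (7) for $X^{1-\kappa}$), we
obtain the altruistic duality

$$ \Pi(X^\kappa) = \Pi(X^{1-\kappa}). \tag{10} $$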

Namely, the per capita system payoff for an $A^\kappa$-game is exactly
equal to that of its dual game $A^{1-\kappa}$. Specifically, we have
$\Pi(X^0) = \Pi(X^1)$, an equivalence of payoff between a completely
egoistic game $A$ and a completely altruistic game $A^\dagger$. We
stress that this duality is nontrivial, unlike, for example, the mere
matrix symmetry (5). One immediate result of (10) is that
$\kappa = 1/2$ has to be an extremum of the payoff $\Pi(X^\kappa)$. If
this is the sole maximum, we have the inequality

$$ \Pi(X^{1/2}) \ge \Pi(X^\kappa) \ge \Pi(X^0). \tag{11} $$
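
The extremum property at equal mixing follows from the duality (10) in one line, under the assumption (not guaranteed in general) that the payoff is differentiable in the preference parameter at the value 1/2. Writing $F(\kappa)$ as a shorthand, introduced here only for this remark, for $\Pi(X^\kappa)$:

```latex
\begin{align*}
F(\kappa)  &= F(1-\kappa)
  && \text{(altruistic duality (10))} \\
F'(\kappa) &= -F'(1-\kappa)
  && \text{(differentiate both sides with respect to } \kappa\text{)} \\
F'(1/2)    &= -F'(1/2)
  \;\Longrightarrow\; F'(1/2) = 0
  && \text{(evaluate at } \kappa = 1/2\text{)}
\end{align*}
```

Whether this stationary point is a maximum or a minimum is not fixed by the symmetry alone; it depends on the sign of the second derivative for the game at hand.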

In general, there could be other extrema, and $\kappa = 1/2$ could
also be a minimum. But we shall show in the following examples that
there are indeed interesting cases in which (11) holds, and that these
examples include the classic prisoner's dilemma. The maximum happiness
of the maximum majority is achieved when every individual in the
system is constrained to pursue an equal mixing of egoistic and
altruistic payoff. In hindsight, this is to be expected, since direct
maximization of the per capita system payoff results in symmetrization
of the game matrix $A$:

$$ \partial_{x_i}\langle x|A|x\rangle = \partial_{s_i}\langle s|(A + A^\dagger)|x\rangle\big|_{s=x} = 0. \tag{12} $$
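
The symmetrization in (12) can be checked with a finite-difference derivative. In the Python sketch below, the component with index 0 is eliminated through normalization, so the derivative with respect to a free component equals the difference between the corresponding entry and the zeroth entry of the symmetrized payoff vector. The helper names, the step size and the 2x2 matrix are assumptions made for illustration.

```python
def full_state(x_free):
    """Prepend x_0 = 1 - sum of the free components."""
    return [1.0 - sum(x_free)] + list(x_free)


def per_capita_payoff(A, x_free):
    """Pi(x) = <x|A|x> as a function of the free components x_1..x_M."""
    x = full_state(x_free)
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def symmetrized_gradient(A, x_free):
    """Return q_i - q_0 with q = (A + A^T) x, the right-hand side of (12)."""
    x = full_state(x_free)
    n = len(x)
    q = [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) for i in range(n)]
    return [q[i] - q[0] for i in range(1, n)]


def numerical_gradient(A, x_free, h=1e-6):
    """Central finite difference of Pi with respect to each free component."""
    grad = []
    for i in range(len(x_free)):
        up = list(x_free)
        down = list(x_free)
        up[i] += h
        down[i] -= h
        grad.append((per_capita_payoff(A, up) - per_capita_payoff(A, down))
                    / (2 * h))
    return grad


# Hypothetical two-strategy payoff matrix.
A = [[0.0, 5.0],
     [-2.0, 1.0]]

g_sym = symmetrized_gradient(A, [0.4])   # analytic form
g_num = numerical_gradient(A, [0.4])     # finite-difference check
g_opt = symmetrized_gradient(A, [0.75])  # gradient at the payoff maximum

print(g_sym, g_num, g_opt)
```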

We illustrate our results with two examples. First, consider a
two-strategy ($i = 0, 1$) game whose payoffs are specified by the
matrix

$$ A = \begin{pmatrix} 0 & \alpha+\beta \\ -\gamma & \beta \end{pmatrix}, \tag{13} $$
FIG. 1: The $A^\kappa$-constrained prisoner's dilemma specified by
the payoff table (13). The parameters are chosen to be $\alpha = 4$,
$\beta = 1$ and $\gamma = 2$. In the middle graph, $p_i$ stands for
$p_i(A, X^\kappa)$.



where $\alpha$, $\beta$ and $\gamma$ are positive real numbers that
satisfy the condition $\alpha > \gamma$. This is the famous example of
the prisoner's dilemma: when two players show a "good hand", $i = 1$,
both obtain the payoff $\beta$, but when one player betrays the other
by playing a "bad hand", $i = 0$, he gets the Devil's reward of
$\alpha + \beta$ while imposing the damage $-\gamma$ on the opponent.
When both players show a "bad hand" there is no payoff. The temporal
evolution of the $A^\kappa$-constrained game is described by the
logistic equation

$$ \dot{x}^\kappa_1 = [\kappa(\alpha+\beta+\gamma) - \gamma]\,x^\kappa_1 - (\alpha-\gamma)\,(x^\kappa_1)^2. \tag{14} $$



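The evolutionary Nash equilibrium is the fixed point $X^\kappa_1$,
with the average payoff $\Pi(X^\kappa)$; they are given respectively
by

$$ X^\kappa_1 = -\frac{\gamma}{\alpha-\gamma} + \frac{\alpha+\beta+\gamma}{\alpha-\gamma}\,\kappa, \qquad \Pi(X^\kappa) = -\frac{(\alpha+\beta)\gamma}{\alpha-\gamma} + \frac{(\alpha+\beta+\gamma)^2}{\alpha-\gamma}\,\kappa(1-\kappa). \tag{15} $$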



The per capita system payoff $\Pi(X^\kappa)$ is indeed
reflection-symmetric with respect to the line $\kappa = 1/2$. An
interesting quantity to look at is the difference
$p_1(A, X^\kappa) - p_0(A, X^\kappa) = -(\alpha+\beta+\gamma)\kappa$.
This is the payoff disparity between good and bad hands, which has to
be tolerated by good-hand players to achieve an $A^\kappa$-game with a
nonzero value of $\kappa$. The peculiarity of this game is that, both
for a purely egoistic game $A^0$ and for a purely altruistic game
$A^1$, the mixed strategy Nash equilibrium (15) is located outside of
the realizable domain $0 < X_1 < 1$. An easy calculation shows that
only $\kappa_0 < \kappa < \kappa_1$, with
$\kappa_0 = \gamma/(\alpha+\beta+\gamma)$ and
$\kappa_1 = \alpha/(\alpha+\beta+\gamma)$, is allowed.



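These are the values that give $X^{\kappa_0}_1 = 0$ and
$X^{\kappa_1}_1 = 1$. If $\kappa = 1/2$ falls between them, the system
eventually reaches this optimum state having

$$ X^{1/2}_1 = \frac{\alpha+\beta-\gamma}{2(\alpha-\gamma)}, \qquad \Pi(X^{1/2}) = \frac{(\alpha+\beta-\gamma)^2}{4(\alpha-\gamma)}. \tag{16} $$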



Otherwise, the system settles for $\Pi(X^{\kappa_1}) = \beta$. Figure
1 depicts an example of the former case. We expect ex- 
perimental studies to be performed to check these pre- 
dictions. Also, comparison with numerical simulations 
with real strategies with memory (such as Tit-for-Tat or 
Pavlov) [4, 9, 10, 11] would be beneficial.
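
The closed forms (15) and (16) are easy to evaluate for the parameters quoted in the caption of Fig. 1. The Python sketch below does so; the function names `pd_equilibrium` and `allowed_range` are hypothetical helpers introduced here, and the formulas assume the interior equilibrium, so they are only meaningful inside the allowed range of the preference parameter.

```python
def pd_equilibrium(alpha, beta, gamma, kappa):
    """Return (X_1, Pi) of eq. (15) for the A^kappa-constrained game."""
    total = alpha + beta + gamma
    X1 = (kappa * total - gamma) / (alpha - gamma)
    Pi = (-(alpha + beta) * gamma
          + total ** 2 * kappa * (1 - kappa)) / (alpha - gamma)
    return X1, Pi


def allowed_range(alpha, beta, gamma):
    """Return (kappa_0, kappa_1), the values at which X_1 is 0 and 1."""
    total = alpha + beta + gamma
    return gamma / total, alpha / total


# Parameters of Fig. 1.
alpha, beta, gamma = 4.0, 1.0, 2.0

kappa0, kappa1 = allowed_range(alpha, beta, gamma)
X_half, Pi_half = pd_equilibrium(alpha, beta, gamma, 0.5)

print("allowed range:", kappa0, kappa1)
print("optimum state:", X_half, Pi_half)

# Reflection symmetry of the payoff about kappa = 1/2.
for kappa in (0.3, 0.4):
    left = pd_equilibrium(alpha, beta, gamma, kappa)[1]
    right = pd_equilibrium(alpha, beta, gamma, 1 - kappa)[1]
    print(kappa, left, right)
```

For these parameters the value 1/2 lies inside the allowed range, so the optimum state (16) is reachable, and its payoff exceeds the payoff of universal good-hand play.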

As our second example, we consider the following three-strategy
($i = 0, 1, 2$) game:

$$ A = \begin{pmatrix} 0 & 0 & 0 \\ b & b-a & b-p \\ -d & fp-d & -d \end{pmatrix}, \tag{17} $$



where $a$, $b$, $d$, $f$ and $p$ are positive real numbers. With
strategy 0, a player abstains. With strategy 1, he/she produces worth
valued at $b$. The worth is reduced by $a$ when the opponent also
produces worth, because of overcrowding. With strategy 2, the player
wastes his/her resources valued at $d$, but he/she can derive worth
valued at $fp$ from a worth-producing opponent by way of raiding,
diminishing the opponent's worth by $p$. If we describe the system by
the three-strategy vector $x$, a natural interpretation is that $x_1$
represents the portion of the total population which subsists on
environmental riches ("commoners"), and $x_2$ the portion which tries
to dominate the opponents ("knights"). Under the replicator dynamics
with the $A^\kappa$-constrained game, the evolution of the system is
governed by

$$ \begin{aligned} \dot{x}^\kappa_1 &= (1-\kappa)\,b\,x^\kappa_1 - a\,(x^\kappa_1)^2 - (1-\kappa-\kappa f)\,p\,x^\kappa_1 x^\kappa_2, \\ \dot{x}^\kappa_2 &= -(1-\kappa)\,d\,x^\kappa_2 + (f-\kappa-\kappa f)\,p\,x^\kappa_1 x^\kappa_2. \end{aligned} \tag{18} $$

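This is the classical Lotka-Volterra prey-predator system, of which
the Nash equilibrium is obtained as

$$ X^\kappa_1 = \frac{(1-\kappa)\,d}{(f-\kappa-\kappa f)\,p}, \qquad X^\kappa_2 = \frac{1-\kappa}{(1-\kappa-\kappa f)\,p}\left[\,b - \frac{a d}{(f-\kappa-\kappa f)\,p}\right]. \tag{19} $$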



One complication is that, when $\kappa$ becomes larger than
$\kappa^* = \left(f - \frac{ad}{bp}\right)/(1+f)$, the fixed point
$X^\kappa_2$ falls below zero and becomes unstable. Concurrently,
however, there appears a new trivial Nash equilibrium, which is given
by

$$ X^\kappa_1 = \frac{(1-\kappa)\,b}{a}, \qquad X^\kappa_2 = 0. \tag{20} $$
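
The switch between the two equilibria can be made concrete with the parameters quoted in the caption of Fig. 2. The Python sketch below evaluates the threshold value of the preference parameter and selects between the prey-predator fixed point (19) and the single-species fixed point (20). The function names are hypothetical, and the branch assumes parameter values for which the denominators in (19) stay positive below the threshold, as they do for the Fig. 2 parameters.

```python
def kappa_star(a, b, d, f, p):
    """Threshold at which the knight population X_2 of eq. (19) vanishes."""
    return (f - a * d / (b * p)) / (1 + f)


def equilibrium(a, b, d, f, p, kappa):
    """Return (X_1, X_2): eq. (19) below kappa*, eq. (20) at or above it."""
    if kappa < kappa_star(a, b, d, f, p):
        g = (f - kappa - kappa * f) * p          # shared denominator in (19)
        X1 = (1 - kappa) * d / g
        X2 = (1 - kappa) / ((1 - kappa - kappa * f) * p) * (b - a * d / g)
        return X1, X2
    return (1 - kappa) * b / a, 0.0              # knights have died out


# Parameters of Fig. 2.
a, b, d, f, p = 1.0, 1.0, 1.0, 0.8, 4.0

k_star = kappa_star(a, b, d, f, p)
print("kappa* =", k_star)
for kappa in (0.0, 0.2, 0.5):
    print(kappa, equilibrium(a, b, d, f, p, kappa))
```

The commoner population is continuous across the threshold: both branches give the same value of the first component there, while the second component reaches zero.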



This case corresponds to a single-species logistic evolution, for
which the game matrix is effectively reduced to

$$ A^\kappa = \begin{pmatrix} 0 & \kappa b \\ (1-\kappa)b & b-a \end{pmatrix}. \tag{21} $$
FIG. 2: The three-strategy game specified by the payoff matrix (17)
with $A^\kappa$-constrained game dynamics. The parameters are chosen
to be $a = 1$, $b = 1$, $d = 1$, $f = 0.8$ and $p = 4$. In the middle
graph, $p_i$ stands for $p_i(A, X^\kappa)$, and $p_0$ is identically
zero. At $\kappa = \kappa^*$, the system displays the transition from
prey-predator dynamics to single-species logistic dynamics.



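In either case, the $A^\kappa$-Nash equilibrium yields the per capita
system payoff

$$ \Pi(X^\kappa) = \frac{D}{T}, \qquad D = \det(A^\kappa), \quad T = \sum_{i,j} (-1)^{i+j} \Delta_{ij}(A^\kappa), \tag{22} $$

where $\Delta_{ij}(A^\kappa)$ are the minors of $A^\kappa$.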
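
The determinant formula (22) can be compared against the payoff evaluated directly at the fixed point. The Python sketch below uses a naive cofactor expansion, which is adequate for the small matrices considered here, with the parameters of Fig. 2. All function names are hypothetical; the direct payoff below the threshold is taken as the payoff of strategy 0 at the fixed point (19), and above the threshold the reduced matrix (21) is used.

```python
def minor(M, i, j):
    """Matrix M with row i and column j removed."""
    return [[M[r][c] for c in range(len(M)) if c != j]
            for r in range(len(M)) if r != i]


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j))
               for j in range(len(M)))


def payoff_from_determinants(Ak):
    """Pi = D / T with D = det(A^kappa) and T the sum of signed minors."""
    n = len(Ak)
    D = det(Ak)
    T = sum((-1) ** (i + j) * det(minor(Ak, i, j))
            for i in range(n) for j in range(n))
    return D / T


def game_matrix(A, kappa):
    """Return A^kappa = (1 - kappa) A + kappa A^T."""
    n = len(A)
    return [[(1 - kappa) * A[i][j] + kappa * A[j][i] for j in range(n)]
            for i in range(n)]


# Parameters of Fig. 2 and the payoff matrix (17).
a, b, d, f, p = 1.0, 1.0, 1.0, 0.8, 4.0
A = [[0.0, 0.0, 0.0],
     [b, b - a, b - p],
     [-d, f * p - d, -d]]

# Below kappa*: full three-strategy game, compared with kappa (b X_1 - d X_2).
kappa = 0.2
Pi_full = payoff_from_determinants(game_matrix(A, kappa))
g = (f - kappa - kappa * f) * p
X1 = (1 - kappa) * d / g
X2 = (1 - kappa) / ((1 - kappa - kappa * f) * p) * (b - a * d / g)
Pi_direct = kappa * (b * X1 - d * X2)

# Above kappa*: reduced two-strategy matrix (21).
kappa = 0.5
Pi_reduced = payoff_from_determinants([[0.0, kappa * b],
                                       [(1 - kappa) * b, b - a]])

print(Pi_full, Pi_direct, Pi_reduced)
```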



An example of this model with specific parameters is depicted in
Figure 2. The payoff $\Pi(X^\kappa)$ has a single peak at
$\kappa = 1/2$ as before, but because of the change in the nature of
the dynamics at $\kappa^*$, it is no longer symmetric. Suppose we
start from the knight-commoner dynamics of the $\kappa = 0$ game. The
system is Pareto optimal in that the payoffs for both commoners and
knights are identical. Turning on a nonzero $\kappa$ amounts to
introducing an "altruistic culture". Curiously, this reduces the
population capacity of knights and, at the same time, increases their
payoff. For commoners, it results in a larger population and a lower
relative payoff. Although the overall per capita payoff increases
quickly, the class disparity also increases, with knights commanding
ever higher relative payoff as the system becomes more altruistic. At
$\kappa = \kappa^*$, however, the population capacity for knights
becomes zero and the "aristocratic" regime collapses. Above
$\kappa^*$, we have a "democratic" regime consisting of a single
self-sustaining population of commoners, ever prospering with
increasing altruism until the system hits the ceiling level at
$\kappa = 1/2$, which is the global stability point.

In both of the above two examples, $\kappa = 1/2$ turns out to be the
only maximum in the allowed region. But in general, higher polynomials
in (22) for larger $M$ could result in more structures in the
$\Pi(X^\kappa)$ profile for higher numbers of strategies. We should
thus expect to find more complex dynamics.

The indirect payoff maximization through the "communal" arrangement
of social goods is widespread among ecosystems in which components
have intellectual capacity. The examples we have studied in this
article are simple toy models which do not necessarily have specific
real-world counterparts. The fact that they show features reminiscent
of concepts devised by the socio-economic philosophers in past
centuries is rather intriguing.

The author wishes to acknowledge his gratitude to 
Takuma Yamada, Toshiya Kawai and David Greene for 
helpful discussions and useful comments. 